Revert "[SPARK-56975][SS] Reject user-specified schema in DataStreamReader.table()" #56189

Closed
PorridgeSwim wants to merge 2 commits into apache:master from PorridgeSwim:revert-SPARK-56975

PorridgeSwim (Contributor) commented May 28, 2026

What changes were proposed in this pull request?

This reverts commit 05b4d81f3f938ff140886d6f66ad66d08c66d5b2 (SPARK-56975), which made DataStreamReader.table() reject a user-specified schema by calling assertNoSpecifiedSchema("table"). This restores the previous behavior, where a user-specified schema passed before .table() is accepted (and ignored).
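
To make the two behaviors concrete, here is a minimal toy model in plain Python. It is a sketch under stated assumptions, not Spark's code: `ToyStreamReader`, its `reject_schema_on_table` flag, the stand-in `AnalysisException` class, and the error text are all hypothetical names invented for illustration. Only the shape of the check (an `assertNoSpecifiedSchema("table")`-style guard that runs before `.table()`) comes from the description above.

```python
# Toy model only: none of these classes are Spark's real implementation.
class AnalysisException(Exception):
    """Hypothetical stand-in for Spark's AnalysisException."""


class ToyStreamReader:
    def __init__(self, reject_schema_on_table):
        # True models the SPARK-56975 behavior; False models the restored behavior.
        self._reject = reject_schema_on_table
        self._schema = None

    def schema(self, schema):
        self._schema = schema
        return self  # builder style, so calls can be chained

    def _assert_no_specified_schema(self, operation):
        # Hypothetical message text; the real error is _LEGACY_ERROR_TEMP_1189.
        if self._schema is not None:
            raise AnalysisException(
                f"user-specified schema is not allowed with {operation}"
            )

    def table(self, name):
        if self._reject:
            self._assert_no_specified_schema("table")
        # The stored schema is never consulted here, which is what
        # "accepted (and ignored)" means in the description.
        return f"stream of table {name}"


# Restored behavior: the schema is accepted and has no effect.
restored = ToyStreamReader(reject_schema_on_table=False)
result = restored.schema("id INT").table("events")

# SPARK-56975 behavior: the same chain fails before the table is read.
strict = ToyStreamReader(reject_schema_on_table=True)
try:
    strict.schema("id INT").table("events")
    raised = False
except AnalysisException:
    raised = True
```

In this model the only difference between the two readers is whether the guard runs, which mirrors why the revert is a one-line behavioral change for callers of `spark.readStream.schema(s).table(name)`.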

Why are the changes needed?

SPARK-56975 is a behavior-breaking change. Code that previously ran successfully — e.g. spark.readStream.schema(s).table(name) — now throws an AnalysisException (_LEGACY_ERROR_TEMP_1189). While a schema has no effect on .table(), rejecting it outright breaks existing user workloads that set a schema on the DataStreamReader before calling .table().

A user-facing behavior change like this must go through the project's breaking-change process, which was not followed for SPARK-56975. We are reverting it to restore backward compatibility; a proper deprecation path can be pursued separately if the stricter behavior is still desired.

Does this PR introduce any user-facing change?

Yes. It restores the pre-SPARK-56975 behavior: DataStreamReader.table() again accepts (and silently ignores) a user-specified schema instead of throwing AnalysisException (_LEGACY_ERROR_TEMP_1189). Since SPARK-56975 only landed in unreleased branches (master and branch-4.2), there is no change relative to any released Spark version.

How was this patch tested?

This is a straight git revert. Existing DataStreamTableAPISuite tests pass; the test added by SPARK-56975 ("read: user-specified schema is not allowed with table API") is removed as part of the revert.

Was this patch authored or co-authored using generative AI tooling?

No.

@PorridgeSwim PorridgeSwim changed the title Revert "[SPARK-56975][SS] Reject user-specified schema in DataStreamR… Revert "[SPARK-56975][SS] Reject user-specified schema in DataStreamReader.table()" May 28, 2026
anishshri-db pushed a commit that referenced this pull request May 30, 2026
…eader.table()"

### What changes were proposed in this pull request?
This reverts commit `05b4d81f3f938ff140886d6f66ad66d08c66d5b2` (SPARK-56975), which made `DataStreamReader.table()` reject a user-specified schema by calling `assertNoSpecifiedSchema("table")`. This restores the previous behavior, where a user-specified schema passed before `.table()` is accepted (and ignored).

### Why are the changes needed?
SPARK-56975 is a behavior-breaking change. Code that previously ran successfully — e.g. `spark.readStream.schema(s).table(name)` — now throws an `AnalysisException` (`_LEGACY_ERROR_TEMP_1189`). While a schema has no effect on `.table()`, rejecting it outright breaks existing user workloads that set a schema on the `DataStreamReader` before calling `.table()`.

A user-facing behavior change like this must go through the project's breaking-change process, which was not followed for SPARK-56975. We are reverting it to restore backward compatibility; a proper deprecation path can be pursued separately if the stricter behavior is still desired.

### Does this PR introduce _any_ user-facing change?
Yes. It restores the pre-SPARK-56975 behavior: `DataStreamReader.table()` again accepts (and silently ignores) a user-specified schema instead of throwing `AnalysisException` (`_LEGACY_ERROR_TEMP_1189`). Since SPARK-56975 only landed in unreleased branches (`master` and `branch-4.2`), there is no change relative to any released Spark version.

### How was this patch tested?
This is a straight `git revert`. Existing `DataStreamTableAPISuite` tests pass; the test added by SPARK-56975 (`"read: user-specified schema is not allowed with table API"`) is removed as part of the revert.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #56189 from PorridgeSwim/revert-SPARK-56975.

Lead-authored-by: You Zhou <you.zhou@databricks.com>
Co-authored-by: You Zhou <98635051+PorridgeSwim@users.noreply.github.com>
Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
(cherry picked from commit 6039af8)
Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>